Overview

Dataset statistics

Number of variables: 52
Number of observations: 150622
Missing cells: 125075
Missing cells (%): 1.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 59.8 MiB
Average record size in memory: 416.0 B
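The summary figures above can be cross-checked with simple arithmetic. The sketch below uses only numbers that appear in this report; the per-column missing counts are the "Missing" values listed for age, electivesurgery and admitdiagnosis, and the variable names are illustrative.

```python
# Figures copied from this report (dataset statistics and per-variable "Missing" counts).
n_variables = 52
n_observations = 150622
missing_by_column = {"age": 5192, "electivesurgery": 119542, "admitdiagnosis": 341}

# Missing cells as a share of all cells in the table.
total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

# Average record size times row count, converted from bytes to MiB.
total_mib = 416.0 * n_observations / 2**20

print(missing_cells, round(missing_pct, 1), round(total_mib, 1))
```

The three per-column counts add up to the reported number of missing cells, so no other column contributes missing values.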

Variable types

BOOL: 26
NUM: 22
CAT: 4

Reproduction

Analysis started: 2020-06-09 10:37:55.616627
Analysis finished: 2020-06-09 10:39:38.013311
Duration: 1 minute and 42.4 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
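The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library (the format string is an assumption about how the timestamps were written):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-06-09 10:37:55.616627", FMT)
finished = datetime.strptime("2020-06-09 10:39:38.013311", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.1f} seconds")
```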

Warnings

sicuday has constant value "1" (Constant)
saps3day1 has constant value "0" (Constant)
saps3today has constant value "0" (Constant)
saps3yesterday has constant value "0" (Constant)
teachtype has constant value "0" (Constant)
region has constant value "3" (Constant)
managementsystem has constant value "1" (Constant)
var03hspxlos has constant value "0" (Constant)
admitdiagnosis has a high cardinality: 426 distinct values (High cardinality)
patientunitstayid is highly correlated with df_index (High correlation)
df_index is highly correlated with patientunitstayid (High correlation)
day1meds is highly correlated with meds (High correlation)
meds is highly correlated with day1meds (High correlation)
day1verbal is highly correlated with verbal (High correlation)
verbal is highly correlated with day1verbal (High correlation)
day1motor is highly correlated with motor (High correlation)
motor is highly correlated with day1motor (High correlation)
day1eyes is highly correlated with eyes (High correlation)
eyes is highly correlated with day1eyes (High correlation)
day1pao2 is highly correlated with pao2 (High correlation)
pao2 is highly correlated with day1pao2 (High correlation)
day1fio2 is highly correlated with fio2 (High correlation)
fio2 is highly correlated with day1fio2 (High correlation)
age has 5192 (3.4%) missing values (Missing)
electivesurgery has 119542 (79.4%) missing values (Missing)
df_index has unique values (Unique)
apachepredvarid has unique values (Unique)
patientunitstayid has unique values (Unique)
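Several of these warnings can be reproduced with very simple checks. The sketch below is not pandas-profiling's implementation; `flag_columns` and the toy data are hypothetical. It flags constant columns, all-unique columns, and pairs of columns that are exact copies of each other. The copy check is stricter than a correlation threshold, but it is worth running here because each day1* column in this report shows the same summary statistics as its counterpart, which suggests (without proving) that they are duplicates.

```python
from itertools import combinations

def flag_columns(columns):
    """Return (constant, unique, identical_pairs) for a dict of name -> list of values."""
    constant = [name for name, values in columns.items() if len(set(values)) == 1]
    unique = [name for name, values in columns.items() if len(set(values)) == len(values)]
    identical = [
        (a, b)
        for a, b in combinations(columns, 2)
        if columns[a] == columns[b]
    ]
    return constant, unique, identical

# Toy data: hypothetical values chosen for illustration, not a sample of the dataset.
toy = {
    "region": [3, 3, 3, 3],
    "patientunitstayid": [141168, 141194, 141197, 141203],
    "verbal": [5, 4, 5, 1],
    "day1verbal": [5, 4, 5, 1],
}
constant, unique, identical = flag_columns(toy)
print(constant, unique, identical)
```

Constant columns carry no information for modelling, and one column of each identical pair can usually be dropped.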

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count150622
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean81908.17210633241
Minimum0
Maximum163656
Zeros1
Zeros (%)< 0.1%
Memory size1.1 MiB

Quantile statistics

Minimum0
5-th percentile8087.05
Q140954.25
median81845.5
Q3122967.75
95-th percentile155618.95
Maximum163656
Range163656
Interquartile range (IQR)82013.5

Descriptive statistics

Standard deviation47272.76369
Coefficient of variation (CV)0.5771434336
Kurtosis-1.201342145
Mean81908.17211
Median Absolute Deviation (MAD)41008.5
Skewness0.001084073896
Sum1.23371727e+10
Variance2234714187
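The derived statistics in this block follow from the primary ones. A short check using the df_index figures reported above (values are copied as printed, so results agree only up to rounding):

```python
# Values copied from the df_index block of this report.
mean = 81908.17211
std = 47272.76369
q1, q3 = 40954.25, 122967.75
minimum, maximum = 0, 163656

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(round(cv, 4), iqr, value_range, round(variance))
```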
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
20471< 0.1%
 
668261< 0.1%
 
1139491< 0.1%
 
1119001< 0.1%
 
996101< 0.1%
 
1057531< 0.1%
 
1037041< 0.1%
 
1262311< 0.1%
 
1241821< 0.1%
 
1303251< 0.1%
 
Other values (150612)150612> 99.9%
 
ValueCountFrequency (%) 
01< 0.1%
 
21< 0.1%
 
31< 0.1%
 
41< 0.1%
 
51< 0.1%
 
ValueCountFrequency (%) 
1636561< 0.1%
 
1636551< 0.1%
 
1636531< 0.1%
 
1636521< 0.1%
 
1636511< 0.1%
 

apachepredvarid
Real number (ℝ≥0)

UNIQUE

Distinct count150622
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1207613.7796868982
Minimum11
Maximum2426241
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum11
5-th percentile105601.5
Q1577757.25
median1233254.5
Q31834596
95-th percentile2320800.8
Maximum2426241
Range2426230
Interquartile range (IQR)1256838.75

Descriptive statistics

Standard deviation717569.9296
Coefficient of variation (CV)0.5942048209
Kurtosis-1.23529001
Mean1207613.78
Median Absolute Deviation (MAD)624627
Skewness-0.01331022758
Sum1.818932027e+11
Variance5.149066039e+11
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
18957921< 0.1%
 
10600151< 0.1%
 
12587001< 0.1%
 
12484591< 0.1%
 
23011291< 0.1%
 
2224061< 0.1%
 
2142101< 0.1%
 
6252301< 0.1%
 
15906751< 0.1%
 
8347491< 0.1%
 
Other values (150612)150612> 99.9%
 
ValueCountFrequency (%) 
111< 0.1%
 
211< 0.1%
 
361< 0.1%
 
511< 0.1%
 
531< 0.1%
 
ValueCountFrequency (%) 
24262411< 0.1%
 
24262241< 0.1%
 
24262131< 0.1%
 
24261841< 0.1%
 
24261621< 0.1%
 

patientunitstayid
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct count150622
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1769139.3832839825
Minimum141168
Maximum3353254
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum141168
5-th percentile229562.65
Q1969206
median1685783.5
Q32750760.75
95-th percentile3206468.55
Maximum3353254
Range3212086
Interquartile range (IQR)1781554.75

Descriptive statistics

Standard deviation986592.8281
Coefficient of variation (CV)0.5576682298
Kurtosis-1.308267745
Mean1769139.383
Median Absolute Deviation (MAD)904042.5
Skewness0.02576842912
Sum2.664713122e+11
Variance9.733654085e+11
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
28724181< 0.1%
 
24688271< 0.1%
 
31370521< 0.1%
 
10908781< 0.1%
 
31888231< 0.1%
 
23995451< 0.1%
 
11011071< 0.1%
 
16682591< 0.1%
 
15823821< 0.1%
 
26371011< 0.1%
 
Other values (150612)150612> 99.9%
 
ValueCountFrequency (%) 
1411681< 0.1%
 
1411941< 0.1%
 
1411971< 0.1%
 
1412031< 0.1%
 
1412081< 0.1%
 
ValueCountFrequency (%) 
33532541< 0.1%
 
33532511< 0.1%
 
33532351< 0.1%
 
33532261< 0.1%
 
33532161< 0.1%
 

sicuday
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
1
150622
ValueCountFrequency (%) 
1150622100.0%
 

saps3day1
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
150622
ValueCountFrequency (%) 
0150622100.0%
 

saps3today
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
150622
ValueCountFrequency (%) 
0150622100.0%
 

saps3yesterday
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
150622
ValueCountFrequency (%) 
0150622100.0%
 

gender
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
82001
1
68621
ValueCountFrequency (%) 
08200154.4%
 
16862145.6%
 

teachtype
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
150622
ValueCountFrequency (%) 
0150622100.0%
 

region
Categorical

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
3
150622
ValueCountFrequency (%) 
3150622100.0%
 

Length

Max length1
Median length1
Mean length1
Min length1

bedcount
Real number (ℝ≥0)

Distinct count68
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean26.30056698224695
Minimum1
Maximum120
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum1
5-th percentile10
Q116
median22
Q332
95-th percentile58
Maximum120
Range119
Interquartile range (IQR)16

Descriptive statistics

Standard deviation15.34013714
Coefficient of variation (CV)0.5832626024
Kurtosis2.634098125
Mean26.30056698
Median Absolute Deviation (MAD)8
Skewness1.550578641
Sum3961444
Variance235.3198075
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
12106667.1%
 
1484975.6%
 
2682805.5%
 
1875935.0%
 
2273064.9%
 
2069524.6%
 
1761854.1%
 
2358163.9%
 
1648613.2%
 
3248553.2%
 
Other values (58)7961152.9%
 
ValueCountFrequency (%) 
110< 0.1%
 
26< 0.1%
 
341< 0.1%
 
42560.2%
 
510020.7%
 
ValueCountFrequency (%) 
1201< 0.1%
 
963< 0.1%
 
8423611.6%
 
804< 0.1%
 
762< 0.1%
 

admitsource
Real number (ℝ)

Distinct count9
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.891901581442286
Minimum-1
Maximum8
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q14
median8
Q38
95-th percentile8
Maximum8
Range9
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.720631546
Coefficient of variation (CV)0.4617578058
Kurtosis-0.9631144219
Mean5.891901581
Median Absolute Deviation (MAD)0
Skewness-0.8168116447
Sum887450
Variance7.401836009
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
88190754.4%
 
42421516.1%
 
12216314.7%
 
7104656.9%
 
270474.7%
 
637092.5%
 
55840.4%
 
32950.2%
 
-12370.2%
 
ValueCountFrequency (%) 
-12370.2%
 
12216314.7%
 
270474.7%
 
32950.2%
 
42421516.1%
 
ValueCountFrequency (%) 
88190754.4%
 
7104656.9%
 
637092.5%
 
55840.4%
 
42421516.1%
 

graftcount
Real number (ℝ≥0)

Distinct count8
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.998758481496727
Minimum1
Maximum8
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum1
5-th percentile3
Q13
median3
Q33
95-th percentile3
Maximum8
Range7
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.1792727844
Coefficient of variation (CV)0.05978233509
Kurtosis125.2445515
Mean2.998758481
Median Absolute Deviation (MAD)0
Skewness0.475947163
Sum451679
Variance0.03213873123
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
314834098.5%
 
48170.5%
 
27380.5%
 
14470.3%
 
52280.2%
 
639< 0.1%
 
710< 0.1%
 
83< 0.1%
 
ValueCountFrequency (%) 
14470.3%
 
27380.5%
 
314834098.5%
 
48170.5%
 
52280.2%
 
ValueCountFrequency (%) 
83< 0.1%
 
710< 0.1%
 
639< 0.1%
 
52280.2%
 
48170.5%
 

meds
Categorical

HIGH CORRELATION

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
148328
1
 
1574
-1
 
720
ValueCountFrequency (%) 
014832898.5%
 
115741.0%
 
-17200.5%
 

Length

Max length2
Median length1
Mean length1.004780178
Min length1

verbal
Real number (ℝ)

HIGH CORRELATION

Distinct count6
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.9633519671761097
Minimum-1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q14
median5
Q35
95-th percentile5
Maximum5
Range6
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.634573332
Coefficient of variation (CV)0.4124219461
Kurtosis0.2951436799
Mean3.963351967
Median Absolute Deviation (MAD)0
Skewness-1.330168977
Sum596968
Variance2.671829976
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
59498663.1%
 
12577417.1%
 
41916412.7%
 
350943.4%
 
233102.2%
 
-122941.5%
 
ValueCountFrequency (%) 
-122941.5%
 
12577417.1%
 
233102.2%
 
350943.4%
 
41916412.7%
 
ValueCountFrequency (%) 
59498663.1%
 
41916412.7%
 
350943.4%
 
233102.2%
 
12577417.1%
 

motor
Real number (ℝ)

HIGH CORRELATION

Distinct count7
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.415576741777429
Minimum-1
Maximum6
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q16
median6
Q36
95-th percentile6
Maximum6
Range7
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.447528396
Coefficient of variation (CV)0.2672897948
Kurtosis7.570351288
Mean5.415576742
Median Absolute Deviation (MAD)0
Skewness-2.863280855
Sum815705
Variance2.095338458
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
611855478.7%
 
5128648.5%
 
177745.2%
 
477075.1%
 
-122941.5%
 
38950.6%
 
25340.4%
 
ValueCountFrequency (%) 
-122941.5%
 
177745.2%
 
25340.4%
 
38950.6%
 
477075.1%
 
ValueCountFrequency (%) 
611855478.7%
 
5128648.5%
 
477075.1%
 
38950.6%
 
25340.4%
 

eyes
Real number (ℝ)

HIGH CORRELATION

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.4370277914248915
Minimum-1
Maximum4
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q13
median4
Q34
95-th percentile4
Maximum4
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.061591122
Coefficient of variation (CV)0.3088689376
Kurtosis4.084455152
Mean3.437027791
Median Absolute Deviation (MAD)0
Skewness-2.102489131
Sum517692
Variance1.126975711
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
410641870.7%
 
32248214.9%
 
1119888.0%
 
274404.9%
 
-122941.5%
 
ValueCountFrequency (%) 
-122941.5%
 
1119888.0%
 
274404.9%
 
32248214.9%
 
410641870.7%
 
ValueCountFrequency (%) 
410641870.7%
 
32248214.9%
 
274404.9%
 
1119888.0%
 
-122941.5%
 

age
Real number (ℝ≥0)

MISSING

Distinct count71
Unique (%)< 0.1%
Missing5192
Missing (%)3.4%
Infinite0
Infinite (%)0.0%
Mean62.121556762703705
Minimum19.0
Maximum89.0
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum19
5-th percentile30
Q152
median64
Q375
95-th percentile85
Maximum89
Range70
Interquartile range (IQR)23

Descriptive statistics

Standard deviation16.39758629
Coefficient of variation (CV)0.2639596807
Kurtosis-0.2648048415
Mean62.12155676
Median Absolute Deviation (MAD)11
Skewness-0.5742610856
Sum9034338
Variance268.8808361
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
6738082.5%
 
6836312.4%
 
7135712.4%
 
7235462.4%
 
6635322.3%
 
6534932.3%
 
6333942.3%
 
7033822.2%
 
6232882.2%
 
6432652.2%
 
Other values (61)11052073.4%
 
(Missing)51923.4%
 
ValueCountFrequency (%) 
195660.4%
 
205360.4%
 
216250.4%
 
226300.4%
 
236440.4%
 
ValueCountFrequency (%) 
8914331.0%
 
8816431.1%
 
8718701.2%
 
8620321.3%
 
8521991.5%
 

admitdiagnosis
Categorical

HIGH CARDINALITY

Distinct count426
Unique (%)0.3%
Missing341
Missing (%)0.2%
Memory size1.1 MiB
SEPSISPULM
 
7526
AMI
 
6263
CVASTROKE
 
5800
CHF
 
5548
SEPSISUTI
 
4614
Other values (421)
120530
ValueCountFrequency (%) 
SEPSISPULM75265.0%
 
AMI62634.2%
 
CVASTROKE58003.9%
 
CHF55483.7%
 
SEPSISUTI46143.1%
 
DKA42962.9%
 
S-CABG42742.8%
 
RHYTHATR39632.6%
 
EMPHYSBRON38102.5%
 
PNEUMBACT34302.3%
 
Other values (416)10075766.9%
 

Length

Max length10
Median length9
Mean length8.102109918
Min length3

thrombolytics
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
148105
1
 
2517
ValueCountFrequency (%) 
014810598.3%
 
125171.7%
 

diedinhospital
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
139280
1
 
11342
ValueCountFrequency (%) 
013928092.5%
 
1113427.5%
 

aids
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
150448
1
 
174
ValueCountFrequency (%) 
015044899.9%
 
11740.1%
 

hepaticfailure
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
148137
1
 
2485
ValueCountFrequency (%) 
014813798.4%
 
124851.6%
 

lymphoma
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
149913
1
 
709
ValueCountFrequency (%) 
014991399.5%
 
17090.5%
 

metastaticcancer
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
147464
1
 
3158
ValueCountFrequency (%) 
014746497.9%
 
131582.1%
 

leukemia
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
149495
1
 
1127
ValueCountFrequency (%) 
014949599.3%
 
111270.7%
 

immunosuppression
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
146464
1
 
4158
ValueCountFrequency (%) 
014646497.2%
 
141582.8%
 

cirrhosis
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
147746
1
 
2876
ValueCountFrequency (%) 
014774698.1%
 
128761.9%
 

electivesurgery
Boolean

MISSING

Distinct count2
Unique (%)< 0.1%
Missing119542
Missing (%)79.4%
Memory size1.1 MiB
1
27837
0
 
3243
(Missing)
119542
ValueCountFrequency (%) 
12783718.5%
 
032432.2%
 
(Missing)11954279.4%
 

activetx
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
1
88220
0
62402
ValueCountFrequency (%) 
18822058.6%
 
06240241.4%
 

readmit
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
142887
1
 
7735
ValueCountFrequency (%) 
014288794.9%
 
177355.1%
 

ima
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
145861
1
 
4761
ValueCountFrequency (%) 
014586196.8%
 
147613.2%
 

midur
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
149205
1
 
1417
ValueCountFrequency (%) 
014920599.1%
 
114170.9%
 

ventday1
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
114051
1
36571
ValueCountFrequency (%) 
011405175.7%
 
13657124.3%
 

oobventday1
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
100565
1
50057
ValueCountFrequency (%) 
010056566.8%
 
15005733.2%
 

oobintubday1
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
111335
1
39287
ValueCountFrequency (%) 
011133573.9%
 
13928726.1%
 

diabetes
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
114559
1
36063
ValueCountFrequency (%) 
011455976.1%
 
13606323.9%
 

managementsystem
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
1
150622
ValueCountFrequency (%) 
1150622100.0%
 

var03hspxlos
Boolean

CONSTANT
REJECTED

Distinct count1
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
150622
ValueCountFrequency (%) 
0150622100.0%
 

pao2
Real number (ℝ)

HIGH CORRELATION

Distinct count2262
Unique (%)1.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean29.59914487923411
Minimum-1.0
Maximum636.0
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q3-1
95-th percentile165
Maximum636
Range637
Interquartile range (IQR)0

Descriptive statistics

Standard deviation69.0494508
Coefficient of variation (CV)2.332819109
Kurtosis11.91637695
Mean29.59914488
Median Absolute Deviation (MAD)0
Skewness3.089035564
Sum4458282.4
Variance4767.826655
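The value -1 accounts for 115671 of the 150622 rows (76.8%), which pulls the mean and every quantile up to Q3 down to or towards -1. If -1 is a placeholder for "not recorded" (an assumption; the report does not say so), the mean of the remaining measurements can be recovered from the figures above without access to the raw data:

```python
# Figures copied from the pao2 block of this report.
n_rows = 150622
sentinel_count = 115671      # rows where pao2 == -1
reported_sum = 4458282.4     # sum over all rows, -1 values included

# Assumption: -1 marks a missing measurement rather than a real value.
valid_rows = n_rows - sentinel_count
valid_sum = reported_sum - sentinel_count * (-1)

mean_all = reported_sum / n_rows
mean_valid = valid_sum / valid_rows
print(round(mean_all, 2), valid_rows, round(mean_valid, 2))
```

Under that assumption, the mean over recorded values is several times larger than the reported mean. The same adjustment would apply to the other columns in which -1 dominates.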
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-111567176.8%
 
763400.2%
 
803310.2%
 
753300.2%
 
703250.2%
 
783240.2%
 
793230.2%
 
713220.2%
 
733220.2%
 
743190.2%
 
Other values (2252)3201521.3%
 
ValueCountFrequency (%) 
-111567176.8%
 
91< 0.1%
 
171< 0.1%
 
17.31< 0.1%
 
182< 0.1%
 
ValueCountFrequency (%) 
6361< 0.1%
 
6201< 0.1%
 
6071< 0.1%
 
6021< 0.1%
 
601.61< 0.1%
 

fio2
Real number (ℝ)

HIGH CORRELATION

Distinct count89
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12.802516232688452
Minimum-1.0
Maximum100.0
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q3-1
95-th percentile100
Maximum100
Range101
Interquartile range (IQR)0

Descriptive statistics

Standard deviation28.05289866
Coefficient of variation (CV)2.19120196
Kurtosis2.822377679
Mean12.80251623
Median Absolute Deviation (MAD)0
Skewness1.981918623
Sum1928340.6
Variance786.9651231
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-111567176.8%
 
10075765.0%
 
4065084.3%
 
5062234.1%
 
6034422.3%
 
3019821.3%
 
2115131.0%
 
7012790.8%
 
8012710.8%
 
359480.6%
 
Other values (79)42092.8%
 
ValueCountFrequency (%) 
-111567176.8%
 
2115131.0%
 
226< 0.1%
 
2331< 0.1%
 
2471< 0.1%
 
ValueCountFrequency (%) 
10075765.0%
 
99.61< 0.1%
 
9911< 0.1%
 
98.81< 0.1%
 
98.52< 0.1%
 

ejectfx
Real number (ℝ)

Distinct count69
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.010357052754577685
Minimum-1
Maximum88
Zeros7
Zeros (%)< 0.1%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q3-1
95-th percentile-1
Maximum88
Range89
Interquartile range (IQR)0

Descriptive statistics

Standard deviation7.424369575
Coefficient of variation (CV)716.8419193
Kurtosis56.69275314
Mean0.01035705275
Median Absolute Deviation (MAD)0
Skewness7.537047642
Sum1560
Variance55.12126359
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-114769698.1%
 
605830.4%
 
555320.4%
 
652790.2%
 
502680.2%
 
402080.1%
 
452050.1%
 
351420.1%
 
301130.1%
 
70930.1%
 
Other values (59)5030.3%
 
ValueCountFrequency (%) 
-114769698.1%
 
07< 0.1%
 
12< 0.1%
 
1011< 0.1%
 
121< 0.1%
 
ValueCountFrequency (%) 
881< 0.1%
 
821< 0.1%
 
8010< 0.1%
 
784< 0.1%
 
771< 0.1%
 

creatinine
Real number (ℝ)

Distinct count1624
Unique (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.053861972354636
Minimum-1.0
Maximum24.95
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q10.51
median0.83
Q31.38
95-th percentile4.18
Maximum24.95
Range25.95
Interquartile range (IQR)0.87

Descriptive statistics

Standard deviation1.856314306
Coefficient of variation (CV)1.76143969
Kurtosis17.70543987
Mean1.053861972
Median Absolute Deviation (MAD)0.43
Skewness3.065964091
Sum158734.798
Variance3.445902804
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-12933119.5%
 
0.838752.6%
 
0.737952.5%
 
0.930322.0%
 
0.629832.0%
 
1.120561.4%
 
119721.3%
 
0.518391.2%
 
1.217781.2%
 
1.315311.0%
 
Other values (1614)9843065.3%
 
ValueCountFrequency (%) 
-12933119.5%
 
0.114< 0.1%
 
0.113< 0.1%
 
0.125< 0.1%
 
0.133< 0.1%
 
ValueCountFrequency (%) 
24.951< 0.1%
 
24.61< 0.1%
 
24.31< 0.1%
 
23.91< 0.1%
 
23.871< 0.1%
 

dischargelocation
Real number (ℝ)

Distinct count7
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.319189759796045
Minimum-1
Maximum9
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile4
Q14
median4
Q37
95-th percentile8
Maximum9
Range10
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.852318566
Coefficient of variation (CV)0.3482332178
Kurtosis-1.07876761
Mean5.31918976
Median Absolute Deviation (MAD)0
Skewness0.7210437704
Sum801187
Variance3.431084071
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
49656664.1%
 
83019320.0%
 
7133668.9%
 
962694.2%
 
633722.2%
 
56700.4%
 
-11860.1%
 
ValueCountFrequency (%) 
-11860.1%
 
49656664.1%
 
56700.4%
 
633722.2%
 
7133668.9%
 
ValueCountFrequency (%) 
962694.2%
 
83019320.0%
 
7133668.9%
 
633722.2%
 
56700.4%
 

visitnumber
Real number (ℝ≥0)

Distinct count8
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.0633307219396901
Minimum1
Maximum8
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile2
Maximum8
Range7
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.2851711407
Coefficient of variation (CV)0.2681866843
Kurtosis49.70515778
Mean1.063330722
Median Absolute Deviation (MAD)0
Skewness5.837860666
Sum160161
Variance0.08132257948
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
114236094.5%
 
272504.8%
 
38240.5%
 
41380.1%
 
532< 0.1%
 
611< 0.1%
 
75< 0.1%
 
82< 0.1%
 
ValueCountFrequency (%) 
114236094.5%
 
272504.8%
 
38240.5%
 
41380.1%
 
532< 0.1%
 
ValueCountFrequency (%) 
82< 0.1%
 
75< 0.1%
 
611< 0.1%
 
532< 0.1%
 
41380.1%
 

amilocation
Real number (ℝ)

Distinct count8
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.7672385176136288
Minimum-1
Maximum7
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q3-1
95-th percentile-1
Maximum7
Range8
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.143110242
Coefficient of variation (CV)-1.489902052
Kurtosis25.08786721
Mean-0.7672385176
Median Absolute Deviation (MAD)0
Skewness5.087248112
Sum-115563
Variance1.306701026
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-114378695.5%
 
625931.7%
 
420321.3%
 
113620.9%
 
23360.2%
 
52820.2%
 
31310.1%
 
71000.1%
 
ValueCountFrequency (%) 
-114378695.5%
 
113620.9%
 
23360.2%
 
31310.1%
 
420321.3%
 
ValueCountFrequency (%) 
71000.1%
 
625931.7%
 
52820.2%
 
420321.3%
 
31310.1%
 

day1meds
Categorical

HIGH CORRELATION

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.1 MiB
0
148328
1
 
1574
-1
 
720
ValueCountFrequency (%) 
014832898.5%
 
115741.0%
 
-17200.5%
 

Length

Max length2
Median length1
Mean length1.004780178
Min length1

day1verbal
Real number (ℝ)

HIGH CORRELATION

Distinct count6
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.9633519671761097
Minimum-1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q14
median5
Q35
95-th percentile5
Maximum5
Range6
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.634573332
Coefficient of variation (CV)0.4124219461
Kurtosis0.2951436799
Mean3.963351967
Median Absolute Deviation (MAD)0
Skewness-1.330168977
Sum596968
Variance2.671829976
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
59498663.1%
 
12577417.1%
 
41916412.7%
 
350943.4%
 
233102.2%
 
-122941.5%
 
ValueCountFrequency (%) 
-122941.5%
 
12577417.1%
 
233102.2%
 
350943.4%
 
41916412.7%
 
ValueCountFrequency (%) 
59498663.1%
 
41916412.7%
 
350943.4%
 
233102.2%
 
12577417.1%
 

day1motor
Real number (ℝ)

HIGH CORRELATION

Distinct count7
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.415576741777429
Minimum-1
Maximum6
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q16
median6
Q36
95-th percentile6
Maximum6
Range7
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.447528396
Coefficient of variation (CV)0.2672897948
Kurtosis7.570351288
Mean5.415576742
Median Absolute Deviation (MAD)0
Skewness-2.863280855
Sum815705
Variance2.095338458
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
611855478.7%
 
5128648.5%
 
177745.2%
 
477075.1%
 
-122941.5%
 
38950.6%
 
25340.4%
 
ValueCountFrequency (%) 
-122941.5%
 
177745.2%
 
25340.4%
 
38950.6%
 
477075.1%
 
ValueCountFrequency (%) 
611855478.7%
 
5128648.5%
 
477075.1%
 
38950.6%
 
25340.4%
 

day1eyes
Real number (ℝ)

HIGH CORRELATION

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.4370277914248915
Minimum-1
Maximum4
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q13
median4
Q34
95-th percentile4
Maximum4
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.061591122
Coefficient of variation (CV)0.3088689376
Kurtosis4.084455152
Mean3.437027791
Median Absolute Deviation (MAD)0
Skewness-2.102489131
Sum517692
Variance1.126975711
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
410641870.7%
 
32248214.9%
 
1119888.0%
 
274404.9%
 
-122941.5%
 
ValueCountFrequency (%) 
-122941.5%
 
1119888.0%
 
274404.9%
 
32248214.9%
 
410641870.7%
 
ValueCountFrequency (%) 
410641870.7%
 
32248214.9%
 
274404.9%
 
1119888.0%
 
-122941.5%
 

day1pao2
Real number (ℝ)

HIGH CORRELATION

Distinct count2262
Unique (%)1.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean29.59914487923411
Minimum-1.0
Maximum636.0
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q3-1
95-th percentile165
Maximum636
Range637
Interquartile range (IQR)0

Descriptive statistics

Standard deviation69.0494508
Coefficient of variation (CV)2.332819109
Kurtosis11.91637695
Mean29.59914488
Median Absolute Deviation (MAD)0
Skewness3.089035564
Sum4458282.4
Variance4767.826655
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-111567176.8%
 
763400.2%
 
803310.2%
 
753300.2%
 
703250.2%
 
783240.2%
 
793230.2%
 
713220.2%
 
733220.2%
 
743190.2%
 
Other values (2252)3201521.3%
 
ValueCountFrequency (%) 
-111567176.8%
 
91< 0.1%
 
171< 0.1%
 
17.31< 0.1%
 
182< 0.1%
 
ValueCountFrequency (%) 
6361< 0.1%
 
6201< 0.1%
 
6071< 0.1%
 
6021< 0.1%
 
601.61< 0.1%
 

day1fio2
Real number (ℝ)

HIGH CORRELATION

Distinct count89
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12.802516232688452
Minimum-1.0
Maximum100.0
Zeros0
Zeros (%)0.0%
Memory size1.1 MiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q3-1
95-th percentile100
Maximum100
Range101
Interquartile range (IQR)0

Descriptive statistics

Standard deviation28.05289866
Coefficient of variation (CV)2.19120196
Kurtosis2.822377679
Mean12.80251623
Median Absolute Deviation (MAD)0
Skewness1.981918623
Sum1928340.6
Variance786.9651231
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-111567176.8%
 
10075765.0%
 
4065084.3%
 
5062234.1%
 
6034422.3%
 
3019821.3%
 
2115131.0%
 
7012790.8%
 
8012710.8%
 
359480.6%
 
Other values (79)42092.8%
 
ValueCountFrequency (%) 
-111567176.8%
 
2115131.0%
 
226< 0.1%
 
2331< 0.1%
 
2471< 0.1%
 
ValueCountFrequency (%) 
10075765.0%
 
99.61< 0.1%
 
9911< 0.1%
 
98.81< 0.1%
 
98.52< 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
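The calculation described above can be sketched in a few lines of plain Python. This is an illustration of the formula, not the code the report generator uses; the function name is illustrative. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(xs, [2 * x + 1 for x in xs]))   # increasing line: r close to +1
print(pearson_r(xs, [-0.5 * x for x in xs]))    # decreasing line: r close to -1
```

Both lines give |r| close to 1 despite their different slopes, which illustrates the invariance under location and scale changes.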

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
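In other words, ρ is Pearson's r applied to ranks. A self-contained sketch, assuming tied values receive the average of the ranks they span (a common convention; the report does not state which one was used) and using illustrative function names:

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]          # monotonic but not linear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this cubic relationship ρ is 1 up to floating-point error, while r stays below 1.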

Kendall's τ

Like Spearman's rank correlation coefficient, Kendall's rank correlation coefficient (τ) measures the ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
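The definition above corresponds to the variant usually called tau-a, which makes no adjustment for ties. A minimal sketch follows; the function name is illustrative, and note that libraries such as SciPy default to the tie-adjusted tau-b, so their results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```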

Phik (φk)

Phik (φk) is a relatively new correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
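As a rough illustration, the bias-corrected estimator can be computed from a contingency table as sketched below. The formula is the commonly cited form of Bergsma's correction; the function name is hypothetical, and the sketch assumes every row and column of the table has a non-zero total. It is not taken from the report generator's source.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a table given as a list of rows of observed counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Subtract the expected bias from phi^2 and shrink the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

print(cramers_v_corrected([[25, 25], [25, 25]]))  # independent variables
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated variables
```

The independent table yields 0 and the perfectly associated table yields a value of 1 up to floating-point error.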

Missing values

Sample

First rows

(index),df_index,apachepredvarid,patientunitstayid,sicuday,saps3day1,saps3today,saps3yesterday,gender,teachtype,region,bedcount,admitsource,graftcount,meds,verbal,motor,eyes,age,admitdiagnosis,thrombolytics,diedinhospital,aids,hepaticfailure,lymphoma,metastaticcancer,leukemia,immunosuppression,cirrhosis,electivesurgery,activetx,readmit,ima,midur,ventday1,oobventday1,oobintubday1,diabetes,managementsystem,var03hspxlos,pao2,fio2,ejectfx,creatinine,dischargelocation,visitnumber,amilocation,day1meds,day1verbal,day1motor,day1eyes,day1pao2,day1fio2
0,0,1794895,141168,1,0,0,0,1,0,3,12,7,3,0,5,6,4,70.0,RHYTHATR,0,1,0,0,0,0,0,0,0,NaN,1,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,2.30,9,1,-1,0,5,6,4,-1.0,-1.0
1,2,1790923,141194,1,0,0,0,0,0,3,38,4,3,0,4,6,3,68.0,SEPSISUTI,0,0,0,0,0,0,0,0,0,NaN,0,0,0,0,0,0,0,1,1,0,-1.0,-1.0,-1,2.51,4,1,-1,0,4,6,3,-1.0,-1.0
2,3,27799,141197,1,0,0,0,0,0,3,30,8,3,0,5,6,4,71.0,SEPSISPULM,0,0,0,0,0,0,0,0,0,NaN,0,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,-1.00,4,1,-1,0,5,6,4,-1.0,-1.0
3,4,2406432,141203,1,0,0,0,1,0,3,18,4,3,0,1,3,1,77.0,RESPARREST,0,0,0,0,0,0,0,0,0,NaN,1,0,0,0,1,1,0,1,1,0,51.0,100.0,-1,0.56,4,1,-1,0,1,3,1,51.0,100.0
4,5,1489128,141208,1,0,0,0,1,0,3,14,8,3,0,5,6,3,25.0,ODSEDHYP,0,0,0,0,0,0,0,0,0,NaN,0,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,-1.00,7,1,-1,0,5,6,3,-1.0,-1.0
5,6,1786953,141227,1,0,0,0,0,0,3,9,4,3,0,4,6,3,82.0,SEPSISPULM,0,0,0,0,0,0,0,0,0,NaN,1,0,0,0,1,1,0,0,1,0,65.0,21.0,-1,1.90,6,1,-1,0,4,6,3,65.0,21.0
6,7,7948,141229,1,0,0,0,1,0,3,43,8,3,0,5,6,4,NaN,CHF,0,0,0,0,0,0,0,0,0,NaN,1,0,0,0,1,1,0,0,1,0,-1.0,-1.0,-1,-1.00,4,1,-1,0,5,6,4,-1.0,-1.0
7,8,2402459,141233,1,0,0,0,1,0,3,38,1,3,0,5,6,4,81.0,S-VALVMI,0,0,0,0,0,0,0,0,0,1.0,1,0,0,0,1,1,1,0,1,0,142.0,60.0,25,-1.00,4,1,-1,0,5,6,4,142.0,60.0
8,9,1183360,141244,1,0,0,0,0,0,3,38,1,3,0,5,6,4,59.0,S-FEMPGRAF,0,0,0,0,0,0,0,0,0,1.0,0,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,0.65,4,1,-1,0,5,6,4,-1.0,-1.0
9,10,11916,141260,1,0,0,0,1,0,3,19,8,3,0,5,6,4,43.0,ASTHMA,0,0,0,0,0,0,0,0,0,NaN,0,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,1.04,4,1,-1,0,5,6,4,-1.0,-1.0

Last rows

(index),df_index,apachepredvarid,patientunitstayid,sicuday,saps3day1,saps3today,saps3yesterday,gender,teachtype,region,bedcount,admitsource,graftcount,meds,verbal,motor,eyes,age,admitdiagnosis,thrombolytics,diedinhospital,aids,hepaticfailure,lymphoma,metastaticcancer,leukemia,immunosuppression,cirrhosis,electivesurgery,activetx,readmit,ima,midur,ventday1,oobventday1,oobintubday1,diabetes,managementsystem,var03hspxlos,pao2,fio2,ejectfx,creatinine,dischargelocation,visitnumber,amilocation,day1meds,day1verbal,day1motor,day1eyes,day1pao2,day1fio2
150612,163646,821992,3353197,1,0,0,0,1,0,3,20,1,2,0,3,6,4,66.0,S-CABGAOV,0,0,0,0,0,0,0,0,0,1.0,1,0,1,0,0,1,1,0,1,0,380.0,100.0,-1,0.71,4,1,-1,0,3,6,4,380.0,100.0
150613,163647,2418338,3353198,1,0,0,0,1,0,3,11,4,3,0,1,4,2,66.0,COMA,0,0,0,0,0,0,0,0,0,NaN,1,1,0,0,1,1,1,0,1,0,52.0,50.0,-1,1.01,4,4,-1,0,1,4,2,52.0,50.0
150614,163648,825957,3353200,1,0,0,0,1,0,3,23,4,3,0,5,6,4,66.0,HYPOVOLEM,0,0,0,0,0,0,0,0,0,NaN,1,1,0,0,1,1,1,0,1,0,70.0,30.0,-1,0.92,4,5,-1,0,5,6,4,70.0,30.0
150615,163649,825958,3353201,1,0,0,0,1,0,3,23,4,3,0,5,6,3,66.0,PLEUREFFUS,0,0,0,0,0,0,0,0,0,NaN,1,1,0,0,1,1,1,0,1,0,329.0,100.0,-1,-1.00,4,3,-1,0,5,6,3,329.0,100.0
150616,163650,2414366,3353213,1,0,0,0,1,0,3,11,8,3,1,-1,-1,-1,51.0,COMA,0,0,0,0,0,0,0,0,0,NaN,1,0,0,0,1,1,1,0,1,0,108.0,45.0,-1,0.68,7,1,-1,1,-1,-1,-1,108.0,45.0
150617,163651,2108598,3353216,1,0,0,0,1,0,3,14,1,3,0,1,5,1,50.0,S-CYSTOTH,0,0,0,0,0,0,0,0,0,1.0,1,0,0,0,1,1,1,0,1,0,226.0,100.0,-1,0.73,7,1,-1,0,1,5,1,226.0,100.0
150618,163652,821994,3353226,1,0,0,0,1,0,3,23,8,3,1,-1,-1,-1,79.0,PLEUREFFUS,0,1,0,0,0,0,0,0,0,NaN,1,0,0,0,1,1,1,1,1,0,-1.0,-1.0,-1,-1.00,9,1,-1,1,-1,-1,-1,-1.0,-1.0
150619,163653,825962,3353235,1,0,0,0,0,0,3,23,8,3,0,5,6,4,50.0,CHF,0,0,0,0,0,0,0,0,0,NaN,0,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,-1.00,8,1,-1,0,5,6,4,-1.0,-1.0
150620,163655,917301,3353251,1,0,0,0,0,0,3,20,8,3,0,1,1,1,73.0,CARDARREST,0,0,0,0,0,0,0,0,0,NaN,1,0,0,0,1,1,1,1,1,0,80.0,100.0,-1,2.43,8,1,-1,0,1,1,1,80.0,100.0
150621,163656,2418339,3353254,1,0,0,0,0,0,3,14,8,3,0,5,6,4,81.0,LOWGIBLEED,0,0,0,0,0,0,0,0,0,NaN,1,0,0,0,0,0,0,0,1,0,-1.0,-1.0,-1,-1.00,4,1,-1,0,5,6,4,-1.0,-1.0